AITopics | search stage

4e392aa9bc70ed731d3c9c32810f92fb-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 20:37:49 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Vancouver (0.04)
Europe > Austria (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Long-range Meta-path Search on Large-scale Heterogeneous Graphs

Chao Li

Neural Information Processing Systems | Oct-10-2025, 01:59:08 GMT

Utilizing long-range dependency is essential for graph representation learning. Our code is available at https://github.com/JHL-HUST/LMSPS.

dataset, heterogeneous graph, maximum hop, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Vancouver (0.04)
Europe > Austria (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

CompassLLM: A Multi-Agent Approach toward Geo-Spatial Reasoning for Popular Path Query

Ananto, Md. Nazmul Islam, Fatin, Shamit, Ali, Mohammed Eunus, Parvez, Md Rizwan

arXiv.org Artificial Intelligence | Oct-10-2025

The popular path query, which identifies the most frequented routes between locations from historical trajectory data, has important applications in urban planning, navigation optimization, and travel recommendations. While traditional algorithms and machine learning approaches have achieved success in this domain, they typically require model training, parameter tuning, and retraining to accommodate data updates. As Large Language Models (LLMs) demonstrate increasing capabilities in spatial and graph-based reasoning, there is growing interest in exploring how these models can be applied to geo-spatial problems. We introduce CompassLLM, a novel multi-agent framework that applies the reasoning capabilities of LLMs to the geo-spatial domain to solve the popular path query. CompassLLM employs its agents in a two-stage pipeline: a SEARCH stage that identifies popular paths, and a GENERATE stage that synthesizes novel paths when no path exists in the historical trajectory data. Experiments on real and synthetic datasets show that CompassLLM achieves superior accuracy in SEARCH and competitive performance in GENERATE while being cost-effective.
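
To make the query itself concrete, here is a minimal sketch of a popular path lookup by plain frequency counting. This is not CompassLLM's multi-agent method; it is a simple baseline, and the function name `most_popular_path`, the list-of-location-labels trajectory format, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def most_popular_path(trajectories, source, destination):
    """Return the most frequent sub-path from source to destination, or None."""
    counts = Counter()
    for trajectory in trajectories:
        if source not in trajectory:
            continue
        start = trajectory.index(source)
        # Only accept a destination that appears after the first visit to the source.
        if destination in trajectory[start + 1:]:
            end = trajectory.index(destination, start + 1)
            counts[tuple(trajectory[start:end + 1])] += 1
    if not counts:
        # No historical path exists; in the abstract this is the GENERATE case.
        return None
    path, _ = counts.most_common(1)[0]
    return list(path)


# Hypothetical trajectories, each an ordered list of visited locations.
trajectories = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["A", "E", "D"],
    ["X", "A", "B", "C", "D", "Y"],
]
popular = most_popular_path(trajectories, "A", "D")
missing = most_popular_path(trajectories, "D", "A")
```

A `None` result marks the situation where no route is recorded between the two locations, which is where a generative stage would have to synthesize one.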

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.07516

Country: Asia (0.46)

Genre: Research Report (0.64)

Industry: Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

ShuffleGate: An Efficient and Self-Polarizing Feature Selection Method for Large-Scale Deep Models in Industry

Huang, Yihong, Chu, Chen, Zhang, Fan, Chen, Fei, Lin, Yu, Li, Ruiduan, Li, Zhihao

arXiv.org Artificial Intelligence | Mar-18-2025

Deep models in industrial applications, such as deep recommendation systems, rely on thousands of features for accurate predictions. While new features are introduced to capture evolving user behavior, outdated or redundant features often remain, significantly increasing storage and computational costs. To address this issue, feature selection methods are widely adopted to identify and remove less important features. However, existing approaches face two major challenges: (1) they often require complex hyperparameter (Hp) tuning, making them difficult to employ in practice, and (2) they fail to produce well-separated feature importance scores, which complicates straightforward feature removal. Moreover, the impact of removing unimportant features can only be evaluated by retraining the model, a time-consuming and resource-intensive process that severely hinders efficient feature selection. To solve these challenges, we propose a novel feature selection approach, ShuffleGate. In particular, it shuffles all feature values across instances simultaneously and uses a gating mechanism that allows the model to dynamically learn the weights for combining the original and shuffled inputs. Notably, it can generate well-separated feature importance scores and estimate the performance without retraining the model, while introducing only a single Hp. Experiments on four public datasets show that our approach outperforms state-of-the-art methods in feature selection for model retraining. Moreover, it has been successfully integrated into the daily iteration of Bilibili's search models across various scenarios, where it significantly reduces feature set size (up to 60%+) and computational resource usage (up to 20%+), while maintaining comparable performance.
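
The shuffle-and-gate step described above can be sketched as follows. This is one plausible reading of the abstract, not the authors' implementation: it assumes one gate per feature in [0, 1] and a convex combination `g * x + (1 - g) * x_shuffled`, and it shows only the forward mixing, not how the gates are learned. The names `shuffle_columns` and `gate_mix` are hypothetical.

```python
import random


def shuffle_columns(batch, rng):
    """Independently permute each feature column across the instances in the batch."""
    n_features = len(batch[0])
    columns = []
    for j in range(n_features):
        column = [row[j] for row in batch]
        rng.shuffle(column)
        columns.append(column)
    # Transpose back to a list of rows.
    return [[columns[j][i] for j in range(n_features)] for i in range(len(batch))]


def gate_mix(batch, shuffled, gates):
    """Blend original and shuffled values per feature (assumed convex combination)."""
    return [
        [g * x + (1.0 - g) * s for g, x, s in zip(gates, row, shuffled_row)]
        for row, shuffled_row in zip(batch, shuffled)
    ]


rng = random.Random(0)
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
shuffled = shuffle_columns(batch, rng)
# Gate 1.0 keeps feature 0 intact; gate 0.0 replaces feature 1 with shuffled noise.
gates = [1.0, 0.0]
mixed = gate_mix(batch, shuffled, gates)
```

Under this reading, a feature whose gate can fall toward 0 without hurting the loss carries little signal, which is how a gate value could double as an importance score.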

artificial intelligence, machine learning, shufflegate, (18 more...)

arXiv.org Artificial Intelligence

2503.09315

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)

Neural Architecture Codesign for Fast Physics Applications

Weitz, Jason, Demler, Dmitri, McDermott, Luke, Tran, Nhan, Duarte, Javier

arXiv.org Artificial Intelligence | Jan-9-2025

We develop a pipeline to streamline neural architecture codesign for physics applications and reduce the need for ML expertise when designing models for novel tasks. Our method employs neural architecture search and network compression in a two-stage approach to discover hardware-efficient models. This approach consists of a global search stage that explores a wide range of architectures while considering hardware constraints, followed by a local search stage that fine-tunes and compresses the most promising candidates. We exceed baseline performance on various tasks and show further speedup through model compression techniques such as quantization-aware training and neural network pruning. We synthesize the optimal models to high-level synthesis code for FPGA deployment with the hls4ml library. Additionally, our hierarchical search space provides greater flexibility in optimization and can easily extend to other tasks and domains. We demonstrate this with two case studies: Bragg peak finding in materials science and jet classification in high energy physics, achieving models with improved accuracy, smaller latencies, or reduced resource utilization relative to the baseline models.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.05515

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Government > Regional Government (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Adaptive Pruning of Pretrained Transformer via Differential Inclusions

Ding, Yizhuo, Fan, Ke, Wang, Yikai, Sun, Xinwei, Fu, Yanwei

arXiv.org Artificial Intelligence | Jan-6-2025

Large transformers have demonstrated remarkable success, making it necessary to compress these models to reduce inference costs while preserving their performance. Current compression algorithms prune transformers at fixed compression ratios, requiring a unique pruning process for each ratio, which results in high computational costs. In contrast, we propose pruning of pretrained transformers at any desired ratio within a single pruning stage, based on a differential inclusion for a mask parameter. This dynamic can generate the whole regularization solution path of the mask parameter, whose support set identifies the network structure. Therefore, the solution path identifies a Transformer weight family with various sparsity levels, offering greater flexibility and customization. In this paper, we introduce such an effective pruning method, termed SPP (Solution Path Pruning). To achieve effective pruning, we segment the transformers into paired modules, including query-key pairs, value-projection pairs, and sequential linear layers, and apply low-rank compression to these pairs, maintaining the output structure while enabling structural compression within the inner states. Extensive experiments conducted on various well-known transformer backbones have demonstrated the efficacy of SPP.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.03289

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Adaptive Channel Allocation for Robust Differentiable Architecture Search

Li, Chao, Ning, Jia, Hu, Han, He, Kun

arXiv.org Artificial Intelligence | Dec-22-2024

Differentiable ARchiTecture Search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency. However, the excessive accumulation of skip connections when training epochs become large makes it suffer from weak stability and low robustness, limiting its practical applications. Many works have attempted to restrict the accumulation of skip connections by indicators or manual design. These methods, however, are susceptible to human priors and hyper-parameters. In this work, we suggest a more subtle and direct approach that no longer explicitly searches for skip connections in the search stage, based on the paradox that skip connections were proposed to guarantee the performance of very deep networks, but the networks in the search stage of differentiable architecture search are actually very shallow. Instead, by introducing a channel importance ranking and a channel allocation strategy, skip connections are searched implicitly and automatically refill unimportant channels in the evaluation stage. Our method, dubbed the Adaptive Channel Allocation (ACA) strategy, is a general-purpose approach for differentiable architecture search that works across DARTS variants without introducing human priors, indicators, or hyper-parameters. Extensive experiments on various datasets and DARTS variants verify that the ACA strategy is the most effective among existing methods in improving robustness and dealing with the collapse issue when training epochs become large.

artificial intelligence, machine learning, skip connection, (17 more...)

arXiv.org Artificial Intelligence

2204.04681

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)

Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search

Hedar, Abdel-Rahman, Abdel-Hakim, Alaa E., Deabes, Wael, Alotaibi, Youseef, Bouazza, Kheir Eddine

arXiv.org Artificial Intelligence | Oct-22-2024

Metaheuristic search methods have proven to be essential tools for tackling complex optimization challenges, but their full potential is often constrained by conventional algorithmic frameworks. In this paper, we introduce a novel approach called Deep Heuristic Search (DHS), which models metaheuristic search as a memory-driven process. DHS employs multiple search layers and memory-based exploration-exploitation mechanisms to navigate large, dynamic search spaces. By utilizing model-free memory representations, DHS enhances the ability to traverse temporal trajectories without relying on probabilistic transition models. The proposed method demonstrates significant improvements in search efficiency and performance across a range of heuristic optimization problems.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.17042

Country:

North America > United States (0.46)
Asia > Middle East (0.28)

Genre:

Overview (0.67)
Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Long-Sequence Recommendation Models Need Decoupled Embeddings

Feng, Ningya, Pan, Junwei, Wu, Jialong, Chen, Baixu, Wang, Ximei, Li, Qian, Hu, Xian, Jiang, Jie, Long, Mingsheng

arXiv.org Artificial Intelligence | Oct-3-2024

Lifelong user behavior sequences, comprising up to tens of thousands of historical behaviors, are crucial for capturing user interests and predicting user responses in modern recommendation systems. A two-stage paradigm is typically adopted to handle these long sequences: in the first stage, a few relevant behaviors are searched from the original long sequence via an attention mechanism; in the second stage, they are aggregated with the target item to construct a discriminative representation for prediction. In this work, we identify and characterize, for the first time, a neglected deficiency in existing long-sequence recommendation models: a single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes. Initial attempts to address this issue using linear projections (a technique borrowed from language processing) proved ineffective, shedding light on the unique challenges of recommendation models. To overcome this, we propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are initialized and learned separately to fully decouple attention and representation. Extensive experiments and analysis demonstrate that DARE provides more accurate search of correlated behaviors and outperforms baselines with AUC gains up to 0.9% on public datasets and notable online system improvements. Furthermore, decoupling embedding spaces allows us to reduce the attention embedding dimension and accelerate the search procedure by 50% without significant performance impact, enabling more efficient, high-performance online serving.
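
The two-stage paradigm with separate embedding tables can be sketched as below. This is a simplified illustration, not the DARE model: it assumes dot-product scoring with top-k selection for the search stage, and mean pooling concatenated with the target embedding for the aggregation stage. The function name, the table layout, and the toy values are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def two_stage_representation(history, target, attn_emb, repr_emb, k):
    """Search the k most relevant behaviors, then aggregate them with the target.

    attn_emb is used only for scoring (stage 1); repr_emb is used only for the
    final representation (stage 2), so the two tables stay decoupled.
    Assumes a non-empty history and k >= 1.
    """
    # Stage 1: rank behaviors by attention score against the target item.
    ranked = sorted(
        history,
        key=lambda item: dot(attn_emb[item], attn_emb[target]),
        reverse=True,
    )
    selected = ranked[:k]
    # Stage 2: mean-pool the selected representation embeddings.
    dim = len(repr_emb[target])
    pooled = [
        sum(repr_emb[item][d] for item in selected) / len(selected)
        for d in range(dim)
    ]
    # Concatenate the pooled behaviors with the target representation.
    return selected, pooled + repr_emb[target]


# Toy tables: the attention table is narrower than the representation table.
attn_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1], "t": [1.0, 0.0]}
repr_emb = {
    "a": [2.0, 0.0, 0.0],
    "b": [0.0, 2.0, 0.0],
    "c": [0.0, 0.0, 2.0],
    "t": [1.0, 1.0, 1.0],
}
selected, features = two_stage_representation(["a", "b", "c"], "t", attn_emb, repr_emb, k=2)
```

Because scoring touches only `attn_emb`, its dimension can be chosen independently of `repr_emb`, which is the property the abstract credits for the faster search.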

dataset, dimension, representation, (16 more...)

arXiv.org Artificial Intelligence

2410.02604

Country: Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.49)

Learning programs with numerical reasoning

AIHub | Jun-13-2024, 10:29:03 GMT

Drug design is the process of identifying molecules responsible for medicinal activity. Suppose we want to automate drug design with machine learning. To do so, we would like to automatically learn programs which explain why a molecule is active or inactive. For instance, as illustrated in the figure above, a program might determine that a molecule is active if it contains a hydrogen atom with a charge greater than 0.2C, and located within 0.1 angstroms of a carbon atom. Discovering this program involves identifying the numerical values 0.2 and 0.1.
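
The example rule in this paragraph can be written out as a small program. This is a hand-written sketch of the target rule, not the output of any learning system; the dictionary-based atom format, the function name `is_active`, and the sample molecules are assumptions, while the thresholds 0.2 and 0.1 are the numerical values a learner would need to discover.

```python
import math


def is_active(molecule, charge_threshold=0.2, distance_threshold=0.1):
    """Return True if some hydrogen exceeds the charge threshold near a carbon.

    Each atom is a dict with "element", "charge", and "position" (x, y, z),
    with distances in angstroms as in the text.
    """
    carbons = [atom for atom in molecule if atom["element"] == "C"]
    for atom in molecule:
        if atom["element"] != "H" or atom["charge"] <= charge_threshold:
            continue
        # Check whether any carbon lies within the distance threshold.
        if any(
            math.dist(atom["position"], carbon["position"]) <= distance_threshold
            for carbon in carbons
        ):
            return True
    return False


# Hypothetical molecules for illustration only.
active_molecule = [
    {"element": "H", "charge": 0.3, "position": (0.0, 0.0, 0.0)},
    {"element": "C", "charge": 0.0, "position": (0.05, 0.0, 0.0)},
]
low_charge_molecule = [
    {"element": "H", "charge": 0.1, "position": (0.0, 0.0, 0.0)},
    {"element": "C", "charge": 0.0, "position": (0.05, 0.0, 0.0)},
]
far_carbon_molecule = [
    {"element": "H", "charge": 0.3, "position": (0.0, 0.0, 0.0)},
    {"element": "C", "charge": 0.0, "position": (1.0, 0.0, 0.0)},
]
```

Keeping the two thresholds as parameters makes the learning problem explicit: the structure of the rule is fixed, and the search is over the numerical values.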

artificial intelligence, logic & formal reasoning, machine learning, (16 more...)

AIHub

Industry: Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.77)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.58)
